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Abstract 

We present SUPERBOX, a particle-mesh code with high resolution sub-grids and an NGP 
(nearest grid point) force-calculation scheme based on the second derivatives of the po- 
tential. SUPERBOX implements a fast low-storage FFT-algorithm, giving the possibility to 
work with millions of particles on desk-top computers. Test calculations show energy and 
angular momentum conservation to one part in 10 5 per crossing-time. The effects of grid 
and numerical relaxation remain negligible, even when these calculations cover a Hubble- 
time of evolution. As the sub-grids follow the trajectories of individual galaxies, the code 
allows a highly resolved treatment of interactions in clusters of galaxies, such as high- 
velocity encounters between elliptical galaxies and the tidal disruption of dwarf galaxies. 
Excellent agreement is obtained in a comparison with a direct-summation N-body code run- 
ning on special-purpose Grape 3 hardware. The orbital decay of satellite galaxies due to 
dynamical friction obtained with SUPERBOX agrees with Chandrasekhar's treatment when 
the Coulomb logarithm In A « 1.5. 
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1 Introduction 



In direct- summation N-body methods, the computational time t C pu scales with the 
square of the particle number iV p , t C pu ~ N^. This relation inevitably leads to 
bottlenecks when computing large-N galaxy models. While it is still impossible to 
model galaxies star by star over sensible time-laps, alternative algorithms have been 
developed over a number of years which ease the computer requirements at the ex- 
pense of the exact summation, yet provide an adequate treatment of Newtonian 
galactic dynamics. Examples of such techniques are the tree method (Barnes & Hut 
1986, Dave, Dubinski & Hernquist 1997), and the multi-grid or nested grid parti- 
cle mesh-codes like Pandora (Villumsen 1989), Hydra (Pearce & Couchman 
1997), Superb ox, or combinations of the methods like the adaptive refinement 
tree (ART) (Kravtsov, Klypin & Khokhlov. 1997), which can cope with very in- 
homogeneous matter distributions and have high spatial resolution at the density 
maxima. 

While the hierarchical tree-method is independent of the geometry of the prob- 
lem, its main disadvantage is the limitation in particle number and the softening 
needed to avoid undesired two-body relaxation. In contrast, the main advantage 
of a particle-mesh technique is the ability to use very high particle numbers (up 
to several millions on desk-top computers), because the CPU-time scales only lin- 
early with the number of particles, and with N gc log N gc , where iVg C is the number 
of grid-cells. High particle numbers keep the statistical noise low. The disadvantage 
of a grid-based method, however, is the dependency on the geometry of the grid. 
But with increasing spatial resolution, the geometry of the cells becomes less im- 
portant. The only limitation in resolving phenomena is the size of one grid-cell. A 
general discussion and comparison of various numerical schemes will be found in 
Sell wood (1987), and - in the context of direct iV-body simulations - in Spurzem 
(1999). 

The basic idea behind Superb ox, as originally conceived by R. Bien (Bien et al. 
1991), is to increase the resolution only at places where it is necessary, while si- 
multaneously keeping the computational overhead as small as possible by using a 
fixed (i.e. not adaptive) nested grid- architecture. Accuracy is improved by the use 
of nested high-resolution sub-grids and a linear force interpolation to the exact po- 
sition of the particle inside a cell. Two higher-resolution sub-grids are introduced: 
the medium-resolution grid contains an entire galaxy, and the high-resolution grid 
treats its core. Both grids stay focused on the galaxy, moving through the 'local 
universe' that is contained in the coarse outermost grid. 

Superbox has already been used successfully in a variety of investigations, namely 
in the study of the high velocity encounter of the two similar early-type galaxies 
NGC 4782/4783 by Madejski & Bien (1993), in the survey of encounters between 
elliptical galaxies by Wassmer et al. (1993), and in the research on tidal disruption 
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of satellite dwarf galaxies by Kroupa (1997) and Klessen & Kroupa (1998). A ver- 
sion of Superb ox exists that includes sticky particles as a simple model of the 
dynamics of cold molecular gas clouds (Fellhauer 1996). 

In this article we provide an implementation and benchmarks for the Superbox 
algorithm. In Section 2, we lay out the mathematics behind the algorithm and give 
details of how the potential is mapped on the different grids. We then move on to 
test the code: Section 3 deals with the conservation of energy and angular momen- 
tum, while Section 4 discusses relaxation. Section 5 contains profiling information 
about storage and CPU-time requirements. Comparison with a direct- summation 
iVp-body code is made in Section 6. In Section 7 we show - as an example - 
the good agreement of satellite decay with Chandrasekhar's formula for dynam- 
ical friction, and close with conclusions in Section 8. 



2 Method 

Detailed discussions of the particle-mesh (PM) technique can be found in Eastwood 
& Brownrigg (1978), Hockney & Eastwood (1981) and Sellwood (1987). 

Briefly, the array of mass densities, q^, is derived by counting the number of parti- 
cles in each Cartesian grid cell i, j, k, i.e. by using the simple particle-in-cell (NGP 
= "Nearest Grid Point") algorithm. All particles belonging to one galaxy iV P5gal 
have the same mass, m = M gal /7Vp )gal . An alternative would be to use a cloud-in- 
cell technique, which is known to improve momentum and energy conservation, in 
which the mass of the particle is distributed proportionally among the neighbouring 
cells. This is, however, not necessary, since the number of particles is large. 

Poisson's equation is solved for this density array to get the grid-based potential, 
&ijk, which becomes, 



where N denotes the number of grid-cells per dimension (N 3 = N gc ), and 
is some Green's function. To avoid this N^ c procedure, the discrete Fast Fourier 
Transform (FFT) is used, for which N = 2 , K > being an integer. The station- 
ary Green's function is Fourier transformed once at the beginning of the calculation, 
and only the density array is transformed at each time-step: 



N-1 



$ijk = G ^ @abc • H ( 



a-i,b-j,c- 



-.-k, hj,k = 0,...,N- 1, 



(1) 



a,b,c=0 




N-1 



(2) 
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JV-1 

Habc = X! Hijk ■ exp ( - 

i,j,k=0 



27r / 

1 — (cm + bj + ck) 



The two resulting arrays are multiplied cell by cell and transformed back to get the 
grid-based potential, 



G N_1 ( 2n 

$ ijk = Qabc ■ H abc ■ exp ( v/^T — (ai + bj + ck) 



(3) 



The potential is differentiated numerically to estimate the accelerations for each 
particle, each of which is pushed forward on its orbit using the leapfrog scheme 
(see Fig. 1 and sections below). 



read model data 


FFT of Green's function 


start time-step loop 






start galaxy loop 






calculate mass densities of all 5 grids 






calculate potentials (FFT) 






force calculation, update velocities 






end galaxy loop 






update positions, output data 




end time-step loop 




write final model data for restart 



Fig. 1. Float-chart for Superbox. 
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2. 1 Green 's function and the FFT 



The simplest Green's function is adopted, in which the side of a grid cell has unit 
length: 

H ^ k = ^/WTF+^' hJ,k = 0,1,. .,N (4) 
tfooo = 4/3. 

The value of if oo is somewhat arbitrary, and H 000 = H 100 is often used (see e.g. 
Sellwood 1987). Numerical tests performed with Superb ox show that for high 
particle numbers, H 000 = 4/3 yields a slightly slower drift in total energy with time 
(see also Section 3) than H 000 = 1 or 1 .5, while for low particle numbers (compared 
to the number of grid-cells) H 000 = 1 is the better choice. The reason for this is that 
for high particle numbers the collective force dominates over the particle-particle 
(pairwise) force. On the other hand, for low particle numbers (only one or a few 
particles are together in the same cell) pairwise forces are more dominant, and it is 
better to exclude 'self-gravity' (i.e. a particle feels its own force inside its cell), with 
ifooo = 1- This is consistent with analytical arguments by D. Pfenniger (private 
communication). He shows that in one dimension, H = In 4 minimises the power 
at the grid wave-number n/2 for the kernel (compare with Eq. 3) 



Hi = \/\i\ , < |z| < n/2, (5) 
Hi = \n4, i = 0. 

He finds that the power is exactly cancelled at the wavenumber n/2, and for finite 
n can also be cancelled for a value of H close to In 4. For H > In 4 the power 
at high frequencies increases, while for H < In 4 some wavenumbers < n/2 are 
filtered out. In this sense, H = In 4 ps 4/3 optimises the kernel. However, the 
3D-problem is slightly different since forces instead of the potential are used. The 
artificial wavenumbers induced by the grid, that are to be minimised, are then much 
more numerous. This problem would need further addressing. 
The FFT-algorithm gives the exact solution of the grid-based potential for a peri- 
odical system. For the exact solution of an isolated system, which is what we are 
interested in, the size of the density-array has to be doubled (2 N), filling all inac- 
tive grid cells with zero density, and extending the Green's function in the empty 
regions in the following way: 



H2n-i,j,k — H2n-i,2n-j,k ~ H 2n -i,j,2n-k ~ H 2n -i,2n-j,2n-k (6) 
Hi t 2n—j,k Hi2n—j,2n—k Hij^n—k H%,j,k- 

This provides the isolated solution of the potential in the simulated area between 
k = and iV — 1. In the inactive part the results are unphysical. To keep 
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storage as low as possible, only a 2N x 2N x iV-array is used for transforming the 
densities, and a (N + 1) x (N + 1) x (N + l)-array is used for the Green's function. 
For a detailed discussion see Eastwood & Brownrigg (1978) and also Hockney & 
Eastwood (1981). 

The FFT-routine incorporated in Superb ox is a simple one-dimensional FFT and 
is taken from Werner & Schabach (1979) and Press et al. (1986). It is fast and makes 
the code portable and not machine specific. The low- storage algorithm to extend the 
FFT to 3 dimensions, to obtain the 3-D-potential, is taken from Hohl (1970). The 
performance of Superbox can be increased by incorporating machine-optimised 
FFT routines. 



2.2 Acceleration and orbit integration 

In this section we explain the force calculation performed in Superbox, and com- 
pare it with standard methods. We follow closely the notation of Hockney & East- 
wood (1981) and Couchman (1999) to document the differences. 

With the mass density as an example we show, in a formal way, how its values on 
a 3D mesh are obtained. Using the 5-functional for the particle shape function, a 
particle distribution n(x) is given by 



where a is a summation index running over all particles, and x« = (x a , y a , z a ) is 
the position vector of particle a. As smoothing kernel we select here 



Ax denotes a vector of smoothing lengths (Ax, Ay, Az) in three coordinate direc- 
tions. W is a three dimensional generalisation of the standard top-hat function 



n(x) = £5 3 



(x - x a ) , 



(7) 



a 



^Ax) = n(^).n(^).n(^). 



fo, iei>i 



Hence the smoothed mass-density field is 



g(x) — m- W -k n = m 




(8) 
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where m denotes the particle mass, and the ^-operator a convolution as defined by 
the equation above. Mathematically we get from £>(x) a so-called mesh sampled 
functional $ (x) on a three dimensional mesh by 

<J(x) = III(x).0(x), (9) 
using the three-dimensional comb or sampling function defined as 



N 

IU(x)= 5 ( X - X ijk) ■ % - Vijk) ■ S(Z - Z ijk ) , 

i,j,k=0 

where i,j,k are indices of grid cells, whose centres are located at the vector points 
Xjjfc = {xijkiVijki z^k}. From the functional f/(x) a finite set of discrete mesh- 
sampled density values $ m = {@ijk,i,j,k = 0,1,..., AT} is obtained, which are 
defined at the centres of the grid cells. For each cell centre or mesh point they are 
computed by integrating $ over an arbitrarily shaped and sized volume containing 
just this and only this mesh point. The standard discrete FFT procedure (see above) 
is applied to the set of numbers $ m in order to determine the grid-based potential 
$ m = {&ijk, k — 0, 1, iV}. $ m is a set of numbers resulting from the FFT 
and represents the mesh sampled functional $t(x) = III(x)$(x), where $(x) is 
a true smooth gravitational potential. The set of values $ m is obtained from & 
by integrating over volumes containing single mesh points, just in the same way as 
explained above for g. We define the one-dimensional two-point difference operator 
for x direction in the standard way (Hockney & Eastwood 1981, p. 163), 



D x (x, y, z, Ax) = — (5(x + Ax) - 5(x - Ax))5(y)5(z) . 

To keep the notation clearer we remain for the moment in one dimension. A first 
order difference approximation to the x-component of the acceleration vector is 
given by 

m , . . ^ ^ . . $(rr + Ax, y, z) — $(x — Ax, y, z) ,,^ N 
a x ® (x, y, z, Ax) =D x *$(x) = -A L— ^ . (10) 

Sampling this on the mesh yields 



a^ix, y, z, Ax) = III(x) • a/\x, y, z) 

_ ttt/ x + Ax, y, z) - $(rr - Ax, y, z) 

~ W " 2Ax ■ { } 

If this is integrated over volumes containing a single mesh point, again exactly as 
described above for the cases of g and $, a set of accelerations a ijfc ® defined at the 
mesh point centres results: 
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here we have assumed that 2Ax = x i+ ij k — x i-i,jk is selected so as to match 
with the mesh point distances. In our notation the set of values {a Xji jk®, i, j, k = 
0,1,..., iV} is denoted by ct m x ®. The corresponding vector components for the ac- 
celeration in y and z direction are analogous. In the standard particle-mesh schemes, 
the accelerations between the mesh centres are obtained by using the same smooth- 
ing kernel as for the density, i.e. 

a x {1) (x, y, z, Ax) = W* ct x ® (x, y, z) = a Xiijk ® for x - x ijk < — 

and y - y ijk < — 

and z - z ijk < — ■ 

The use of the ^-functional for the particle shape and the top-hat function for 
smoothing gives the rather crude standard NGP scheme, where the acceleration 
on a particle is constant in each cell and has a discontinuity at the cell boundaries. 
A possible improvement regarding energy and momentum conservation is to use a 
so-called cloud-in-cell or CIC scheme, where the shape function of particles is a 
top-hat function, and the assignment function is a triangle function (see Hockney 
& Eastwood 1981). However, as one can also see in the cited book, such schemes 
suffer from force anisotropics, despite having very good conservation properties 
(i.e. the magnitude and direction of the force error depends on whether one goes 
parallel to the mesh coordinates or not). 

Therefore a non-standard scheme is applied in Superbox. It is NGP in nature, 
provides a very good energy and angular momentum conservation and a significant 
improvement for force anisotropics. Note that the rather involved mathematical for- 
malism above will later yield significant hindsight what is special with Superbox. 
The notation follows closely Hockney & Eastwood's and Couchman's. 
We begin with a Taylor expansion of the acceleration (for example the rr-compo- 
nent) centred on (x, y, z), the position of the centre of cell i, j, k: 



a x ®(x + dx,y + dy, z + dz) — aj 2 \x, y, z) + 

da x da x da x 

— {x,y,z)-dx + —(x,y,z)-dy + —{x,y,z)-dz (13) 

+ 0(dx 2 ), 

where 0(dx 2 ) denotes any higher order terms in dx, dy, dz or combinations thereof. 
The displacements dx, dy and dz should not be linked or confused with the mesh 
spacings Ax, Ay, Az. They are free and should provide an interpolated expression 
for a x at any point inside a given mesh cell, e.g. a particle's position. We then use 
the difference approximations 
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a x ®(x,y,z, Ax) = ^(x,y,z, Ax) = D x -k$ , 
da r/ . . . , <9 2 $ 



(x, y, z, Ax, Ay, Az) = ■— (x, y, z, Ax, Ay, Az) = D xx -k$ , 



OX ox z 



x, y, z, Ax, Ay, Az) = ■^-(x, y, z, Ax, Ay, Az) = D xy *$ , 



ay oxoy 
da <9 2 $ 

-j-{x, y, z, Ax, Ay, Az) = t^(x, y, z, Ax, Ay, Az) = D xz -k $ . (14) 

We may now generate a mesh sampled d x ® = IIIa x ® in the standard way and 
integrate this over any volume containing one grid cell centre to obtain a set of 
acceleration values a\ n x ® given by the following expressions: 



a ijk ,®{dx, dy, dz) = < "'-'^ A '"' ^ (15) 

&i+l,jk + &i-l,jk — 2 • $jjk 

+ (Ax) 2 ' dX 

$m,j+i,k - ®i-i,j+i,k + $ir-i,j-i,k - ®m,j-i,k _ , 
AAxAy " ~ ' ' V 

AAxAz 

Note that we generally take Ax = Ay = Az = 1 i.e. the cell-length is assumed to 
be equal along the three axes and unity. The accelerations in y- and ^-direction are 
calculated analogously. 

At first glance the interpolation scheme Eq. 15 for the acceleration seems to be 
closely related to the procedure used in a standard cloud-in-cell (CIC) scheme. In 
one dimension, CIC and Superb ox methods can be more easily compared to each 
other. For < dx < Ax/ 2 one has 



/ t x Ax — dx dx 

CIC : a(x + dx) = ai ■ + a i+ i ■ — — 

Ax Ax 

2Ax + 2(A:r) 2 ' 

da 



Superbox : a(x + dx)=ai + 

dx 



■ dx 



2Ax + (Ax) 2 ' 

where i denotes the cell-index. Clearly this is different, because CIC uses accelera- 
tions in two neighbouring cells, while Superbox takes acceleration and its deriva- 
tive only in one cell. Therefore we propose to consider Superbox as an NGP type 
scheme, although a second order interpolated acceleration is used, because nowhere 
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the acceleration of the neighbouring cell is entering into the equations. Though 
of NGP nature, a x in Superbox behaves steadily when crossing cell boundaries 
along the rr-axis (and a y , a z correspondingly on their respective axes). In contrast to 
CIC our scheme is not steady in any other direction. However, the errors induced by 
that are much smaller than in a standard NGP scheme due to the above higher order 
interpolation, and are perfectly tolerable as will be shown throughout this paper. 

The salient feature of Superbox regarding the force anisotropics can be under- 
stood by considering the force at a distance r = |dr| from a particle located at the 
centre of a cell. To lowest order the Superbox force-error e is then 



where dr = (dx, dy, dz) is the displacement vector from the cell centre, and we 
have neglected all terms of second order in Ax, Ay, or Az in a Taylor series ex- 
pansion of the potential differences. In other words to lowest order the Superbox 
force error points to the exact centre of the cell. Thus, compared with standard NGP 
schemes, we get a very precise force-calculation in Superbox, as shown in Figs. 2 
and 3. 




Fig. 2. Pairwise acceleration calculated by Superbox (using i^ooo = 1) f° r one parti- 
cle fixed in the centre and the second moving along the x-axis. The agreement between 
the acceleration calculated by Superbox (solid line) and the analytical (i.e. Keplerian) 
force (dashed line) is very good, if the two particles are at least 2 cells apart. Upper panel: 
pairwise acceleration over the entire spatial range. Lower panel: blown up pictures for the 
innermost part (left) and the grid-boundary (right). 
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Fig. 3. As Fig. 2, except that here the second particle moves along a diagonal with 

y = x,z = 0,r = \J x 2 + y 2 . 

The orbits of the particles are integrated forward in time using the leapfrog scheme. 
For example, for the re-components of the velocity, v x , and position, x, vectors of 
particle 

<T V2 = <7 1/2 + < r At (17) 

*r +1 =*r+<T 1/2 -A*, 

where n denotes the nth time-step and At is the length of the integration step. 



2.3 The grids 



For each galaxy, 5 grids with 3 different resolutions are used. This is made possible 
by invoking the additivity of the potential (Fig. 4). 

The five grids are as follows: 

• Grid 1 is the high-resolution grid which resolves the centre of the galaxy. It has 
a length of 2 x -R cor e in one dimension. In evaluating the densities, all particles 
of the galaxy within r < R cove are stored in this grid. 

• Grid 2 has an intermediate resolution to resolve the galaxy as a whole. The length 
is 2 x -R ut> but only particles with r < R COIC are stored here, i.e. the same 
particles as are also stored in grid 1 . 
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Grid 1 


, - s 


\ 
\ 
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N ^ Rcore^ 








Grid 1 



Grid 3 




Grid 4 



Grid 5 



Fig. 4. The five grids of SUPERBOX. In each panel, solid lines highlight the relevant 
grid. Particles are counted in the shaded areas of the grids. The lengths of the arrows are 
(N/2) — 2 grid-cells (see text). In the bottom left panel, the grids of a hypothetical second 
galaxy are also shown as dotted lines. 

• Grid 3 has the same size and resolution as grid 2, but it only contains particles 



with R r 



< r < R, 



out- 



• Grid 4 has the size of the whole simulation area (i.e. 'local universe' with 2 x 
-Rsystem), and has the lowest resolution. It is fixed. Only particles of the galaxy 
with r < R out are stored in grid 4. 

• Grid 5 has the same size and resolution as grid 4. This grid treats the escaping 
particles of a galaxy, and contains all particles with r > R out . 

Grids 1 to 3 are focused on a common centre of the galaxy and move with it through 
the 'local universe', as detailed below. All grids have the same number of cells per 
dimension, N, for all galaxies. The boundary condition, requiring two empty cells 
with q = at each boundary, is open and non-periodic, thus providing an isolated 
system. This however means that only N — 4 active cells per dimension are used. 
To keep the storage requirement low, all galaxies are treated consecutively in the 
same grid-arrays, whereby the particles belonging to different galaxies can have 
different masses. Each of the 5 grids has its associated potential % = 1, 2, 5 
computed by the PM technique from the particles of one galaxy located as de- 
scribed above. The accelerations are obtained additively from the 5 potentials of 
each galaxy in turn in the following way: 



$(r) = [9(R core - r) • $i + 9{r - R COIC ) ■ $ 2 + $ 3 ] • ^out - r) (18) 
+ 9(r-R out ) -$ 4 + $ 5 , 

$0Rcorc)=$l + $3 + $5 
®(R out ) = $2 + $3 + $5 
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where = 1 for £ > and 6(g) = otherwise. This means: 



• For a particle in the range r < -R cor c, the potentials of grids 1, 3 and 5 are used 
to calculate the acceleration. 

• For a particle with R com < r < R out , the potentials of grids 2, 3 and 5 are 
combined. 

• And finally, if r > R out then the acceleration is calculated from the potentials of 
grids 4 and 5. 

• A particle with r > R systcm is removed from the computation. 

Due to the additivity of the potential (and hence its derivatives, the accelerations) 
the velocity changes originating from the potentials of each of the galaxies can be 
separately updated and accumulated in the first of the leap-frog formulas Eq. 17. 
The final result does not depend on the order by which the galaxies are taken into 
account and it could be computed even in parallel, if a final accumulation takes 
place. After all velocity changes have been applied in all galaxies, the positions of 
the particles are finally updated (see Fig. 1). 

As long as the galaxies are well separated, they feel only the low-resolution poten- 
tials of the outer grids. But as the galaxies approach each other their high-resolution 
grids overlap, leading to a high-resolution force calculation during the interaction. 

2.3.1 Grid tracking 

Two alternative schemes to position and track the inner and middle grids can be 
used. The most useful scheme centres the grids on the density maximum of each 
galaxy at each step. The position of the density maximum is found by constructing 
a sphere of neighbours centred on the densest region, in which the centre of mass 
is computed. This is performed iteratively. The other option is to centre the grids 
during run-time on the position of the centre of mass of each galaxy using all its 
particles remaining in the computation. 

2.3.2 Edge-effects 

It is shown in Fig. 4 that only spherical regions of the cubic grids contain particles 
(except for Grid 5). Particles with eccentric orbits can cross the border of two grids, 
thus being subject to forces resolved differently. No interpolation of the forces is 
done at the grid-boundaries. This keeps the code fast and slim, but the grid-sizes 
have to be chosen properly in advance to minimise the boundary discontinuities. 
As shown in Section 4, this leads to some additional but negligible relaxation- 
effects, because the derived total potential has insignificant discontinuities at the 
grid-boundaries (Wassmer 1992). The best way to avoid these edge-effects is to 
place the grid-boundaries at 'places' where the slope of the potential is not steep. 
Fig. 5 shows that, as long as the grid-sizes are chosen properly, there are no serious 
effects at the grid-boundaries. Only in the example shown in the bottom-right panel 
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the innermost grid was chosen too small, leading to a low particle number per grid- 
cell. Performing such computations with H 000 = 4/3 leads to spurious results, as 
explained in Section 2.1. Nevertheless the profile is smooth at the grid-boundary. 




Fig. 5. The radial density distribution of Plummer spheres at the grid-boundary between 
the innermost and the middle grid (shown as dotted vertical lines) after computing for 
10 half-mass crossing times for different grid-sizes. The radial density profile resulting 
from the Superbox computation, always with N p = 5 x 10 4 , is shown as dots with 
error-bars. The analytical density law is shown as the solid line. In all cases except the 
bottom left panel R out = 9R P \ (where R p \ is the Plummer radius). Upper left: N = 32 
grids per dimension, i? C orc = 2i? p i. Upper right: N = 64, R CO re = 2i? p i. Middle left: 
N = 32, R COTC = lR ph Middle right: N = 32, R corc = AR pl . Bottom left: N = 32, 
-Rcore = 2i?pi and R out = 14i? p i . Bottom right: N = 32, i? cor e = 0.5i? p i. Only if the 
innermost grid is chosen too small (bottom right) are there significant deviations from the 
analytical profile. For this calculations Hqqq = 4/3 is used. 



2.3.3 Model units 

Superbox employs model units, with the gravitational constant in model units 
being G mod = 2 owing to historical reasons. A flag specifies if the user wants to 
input and output data in physical units (i.e. kpc, M , km/s, Myr). Scaling to and 
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from physical units is achieved via the formulae: 



phys\ _ Gphys Mphys / ^phys \ 
mod / G^mod ^mod V -^mod ' 



.L moc [/ m od -^mod V J- mod 

^phys -^phys -^mod 



^mod ^~phys ^mod 

where L is the length, M the mass, T the time, V the velocity and G the gravita- 
tional constant. The model length unit is taken to be L mod = 1 for the length of 
-Rcorc.i of galaxy 1. The corresponding physical length scale, L p h ys , is taken from 
the input data that specifies the length of this grid. The physical mass of the first 
galaxy, M phys l , is taken to be unity in model units, M mod l = I. The integration 
step-size, T phy s, is specified by the user. The step-size in model units, T mod , follows 
from Eq. 19. 

Converting factors for each grid of all galaxies are calculated to get the forces and 
accelerations, due to the FFT and the differentiation of the potentials requiring a 
cell-length of unity. 



3 Conservation of Energy and Angular Momentum 



In order to check how well energy and angular momentum are conserved, calcu- 
lations with isolated Plummer- spheres in virial equilibrium with different parti- 
cle numbers, time-steps and grid-resolutions are presented here. To have a non- 
vanishing angular momentum vector in ^-direction for the isolated galaxy model, 
all particle orbits are taken to have the same direction in the rry-plane. This pro- 
cedure does not change any other properties of the Plummer-model, but makes it 
possible to check for relative changes in angular momentum. 



3.1 Total Energy 



Calculating the correct potential energy of a stellar system requires summing all 
two-body interactions between all particles. This is quite impossible for iV p > 10 6 . 
By using the available grid-based approximation and adding up only the mean po- 
tential values of the cells for each particle, a useful estimate of the total potential 
energy is obtained. If, on the other hand, the deviation from the centre of the cell 
and contributions from the neighbouring cells were taken into account to obtain 
a more accurate estimate, then CPU-time would have to be used for an arguably 
purposeless endeavour. For simplicity, in Superbox the total potential energy is 
computed without any interpolation inside grid cells (as it was used for the accel- 
eration); only the potential value of the particle's cells are taken into account to 
compute the potential energy. The total potential energy at the nth time-step is thus 
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1 ^p 

E; ot *--'£m i <S> n (x i ,y i ,z i ), (20) 
1 i=i 

where $ n is the sum of the potentials of all grids and galaxies used to calculate 
the force on particle i at the n-th time-step, or in other words, all potentials which 
contribute to the leap-frog update of the velocity in Eq. 17. rrii is the mass of the 
particle (equal for all particles belonging to the same galaxy). 

The total kinetic energy is also calculated only approximately during runtime. This 
comes about because in the leapfrog integration-scheme the velocities are half a 
time-step behind the positions. Within our integration scheme the time-centred in- 
terpolation for the kinetic energy would be 

1 ^p 3 / n+1/2 ™-l/2\ 2 

X>*E ( j , (2D 

where vfj is the j -component of the velocity vector of particle i at time- step n. In 
Superbox the philosophy is to keep the amount of storage as small as possible, 
so that only the current velocities are stored. To keep the error in E^ in as small 
as possible, particle kinetic energies are calculated before and after the velocity 
vectors are updated in one time-step. From these two values the arithmetic average 
is taken: 

1 3 

K„ ~ t E m *E( «7 1/2 ) 2 + «, +1/2 ) 2 ) • ( 22 ) 

4 i=l j=l 

This still implies an additional error of the order of (At> ) 2 , but keeps storage low 
and the code fast. 

To quantify the performance of Superbox in terms of conservation of total energy, 
E to t(t) = E pot (t) + E kin (t), calculations with different time-steps, particle number 
and grid resolution are done. We consider the relative change AE tot (t) / E tot (to), 
where AE tot (t) = E tot (t) — E tot (t ), and t is the reference time. It is chosen to be 
zero when the adjustment to equilibrium after setting up the discrete rendition of 
the galaxy is accomplished. Max(A_E tot (t)/£' tot (0)) denotes the largest deviation 
in energy that occurs during a computation (see Fig. 6). 

Table 1 shows that deviations in E tot (t) are less than 0.5 per cent, as long as the 
time-step, At < T cr /10, where T cr is the half-mass crossing-time of the Plummer- 
model. We consider At < T cr /50 to be a safe choice. In theory (see Hockney & 
Eastwood 1981) the error of the leapfrog integrator should decrease as (At) 2 for 
At — > 0, but as is evident from Table 1, other error sources become prominent at a 
sufficiently small time step. 
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Table 1 

Maximum deviation in total energy, max(A£ , to t/-Etot) after 10T cr for different time-steps 
using iVp = 10 5 particles and TV = 32 grids per dimension. 



At (time-step) 


max(A£ tot (i)/£ tot (0)) 


T cr /10 


0.4 % 


T cr /100 


0.2% 


T cr /1000 


0.2% 



0.02 



0.01 



Ed 
< 



-0.01 - 



-0.02 

max(AE/E) 
-0.03 



i | | | | ii 

data points 

linear regression 

slope -0.002 % / T cr _ 




8 10 



Fig. 6. Linear drift of the total energy AE to t{t) / E tot (0) in per cent for N p = 10 6 particles 
and N = 64 grids per dimension using At = T cr /100 (the regression line is shown with a 
little offset for clarity). 

Fig. 6 shows the time evolution of the change in total energy for a particular calcu- 
lation. A linear drift fits the data well, which holds true in computations extending 
over 50 crossing-times. The slope of the energy-drift is 0.002 per cent per crossing- 
time. The dispersion in energy is artificial due to the crude calculation of potential 
energy. Over a Hubble time, which corresponds to 150 x T cr if the Plummer model 
galaxy is assumed to have T cr = 10 8 yr, the resulting energy error is 0.3 per cent. 
This linear drift is intrinsic to a PM code (see Hockney & Eastwood, their fig. 9.4), 
and its strength depends mainly on the choice of H 000 in the Green's function. H 000 
determines the strength of the gravitational interaction of particles in the same cell, 
and also the self-gravitation of a particle (see Section 2.1). Additional errors arise 
through the limited grid-resolution and the approximations adopted in evaluating 
the energies. This gives an additional oscillation around the value of the linear drift 
seen in Fig. 6. 

Increasing the number of particles per galaxy to N p > 10 5 leads to little further 
improvement in max(A£ , to t/-E'tot) with N = 32. The decrease of the error levels 
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Fig. 7. Maximum deviation in energy, ma\(AE tot (t) / E tot (0)) in per cent, after 10 T cr , for 
different particle numbers and grid-resolutions, using At = T cr /100. 

off if the mean number of particles per cell becomes sufficiently large. However, 
changing the grid resolution from N = 32 to 64 cells per dimension gives an 
improvement by a factor of at least 4, with the decrease in error still progressing 
for N p > 10 6 (Fig. 7). This is due to the estimate of the true potential by our 
grid-based potential, and thus of the potential energy, improving with increasing 
N. It also shows that increasing the number of grid-cells while keeping the number 
of particles low (with H 000 = 4/3) does not improve energy conservation (e.g. 
N = 64, iV p = 10 4 ). 



3.2 Angular Momentum 



To calculate the absolute value of the total angular momentum of the galaxy, L tot , 
the same technique as for the kinetic energy is applied. That is, the arithmetic mean 
of the two velocity values (before and after the velocity update) is calculated for 
each velocity component to obtain an estimate of the velocity vector that is in-phase 
with the position vector. 

The time evolution of AL tot (t) / L tot (t) for one particular calculation is shown in 
Fig. 8. As with E tot , computations over 50T cr show insignificant deviations from 
the linear drift. The slope of the drift is about 0.003 per cent per crossing-time. 

Conservation of L tot does not depend on At as long as the time- step is small enough 
( At < T cr /50 as for energy in Section 3.1). But it is highly dependent on the grid- 
resolution. Changing from N = 32 to 64 grids per dimension improves the change 
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Fig. 8. The change in AL tot (t) / L tot (0) in per cent for N p = 10 6 particles and N = 64 
grids per dimension using At = T cr /100 (the regression is shown offset). 

in angular momentum by at least a factor of 10. From Fig. 9 it can be seen that 
max(AL to t/^tot) is quite independent of N p . 
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Fig. 9. Maximum deviation in angular momentum, max(AL tot (t)/L to t(0)) in per cent, 
after 10T cr , for different particle numbers and grid-resolutions, using At = T cr /100. 
max(ALtot(i)/£tot(0)) is defined analogously to max(A£ , to t(i)/£ ; tot(0)) in Section 3.1. 
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4 Relaxation 



Galaxies have two-body relaxation times of the order of 10 7 Gyr, and show little or 
no relaxation over a Hubble time even in their inner parts. Therefore, a programme 
which models galaxies has to be collision-less. Since no particle-mesh code can 
be entirely free of relaxation, we have to check on which time-scales Superbox 
provides reliable results. 

Following Standish & Aksnes (1969), the following experiment is performed: A 
number of equal-mass and fixed particles is distributed homogeneously inside a 
sphere of radius i? sph with a constant density distribution. A second group of parti- 
cles is distributed on the surface of this sphere. They are allowed to move through 
the centre and leave the sphere on the opposite side. To make sure they leave the 
sphere, they are given an initial non-zero radial velocity component towards the 
centre of the sphere. The points where the moving particles leave the sphere are 
noted and the calculation is stopped after all moving particles have left the sphere. 
For every particle, its deflection angle a is obtained from 



Rt Ri 



cos a = 



\Ri\ \Rf\ 



(23) 



where Ri and R F are the vectors from the centre of the sphere to the initial and 
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Fig. 10. The deflection angle a. 

final position of the particle (see Fig. 10). The mean deflection angle, a, of all 
particles is computed and the relaxation time, T rcl , is calculated from 



nn sin 90- 

T rc i = — — T cr . (24) 
sin a 

Here T cr is the mean time the moving particles require to travel through the sphere. 
The number of moving particles is chosen to be 10 5 , which gives a sufficiently 
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accurate estimate for T re \. The number of fixed particles, A^ x , and the size of the 
inner mesh, -R core , are varied. Grids with 32 grid-cells per dimension are used. For 
a certain combination of iV fix and R COI c, 100 calculations with different random 
number seeds (both for the fixed and moving particles) are performed. In all cases, 
-Rsph < -Rout (see also Fig. 4). 




10 4 10 5 10 6 

Fig. 11. The ratio T ie \/T cr as a function of the number of fixed particles. 

Fig. 1 1 shows the ratio T rel /T cr as a function of iV fix . If one grid contains the whole 
sphere (i? CO rc = l-2-R sp h) a linear dependence of T re \/T CT on iV nx fits the results 
very well. Such a dependency is predicted for large particle numbers by standard 
relaxation theory (Spitzer & Hart 1971; Binney & Tremaine 1987). For iV fix > 10 5 , 
T rc \/T CT > 10 3 . Since T cr ps 10 s yr for a typical galaxy, the relaxation time is one 
order of magnitude larger than a Hubble time. Hence galaxies can be modelled with 
Superb ox provided N p > 10 5 particles per galaxy. 

These results are not substantially changed if the particles have to cross a grid 
boundary. As an example, the case -R cor e = 0.5 R sph is shown in Fig. 1 1 . Compared 
to the one-grid case, T rel is reduced by about 20 per cent nearly independently of 
iV fix . Hence, grid boundaries do not decrease T rcl significantly. The relaxation time 
also drops if the number of grid-cells is increased: Due to the better resolution, 
forces from particles in adjacent cells become larger, increasing the relaxation. As 
an example the case with 128 cells per dimension is shown in Fig. 11. The re- 
laxation times are a factor of 2.3 smaller compared to the case with A" = 32. 
Assuming that all particles within a cell are located at the cell centre, and treating 
the deflection of the moving particles as a pure N-body problem, we may obtain 
an analytical estimate of the relaxation in Superb ox. A consideration similar to 
the one described in Binney & Tremaine (1987) p. 187 ff shows that the relaxation 
time should depend in the following way on N &x and N gc : 



21 



TreX = K- ^ -T cr (25) 

with 7 = 1/3. From the experiments described in this section we obtain a value for 
the constant of proportionality k = 0.05. Together with this value, Eq. 25 gives an 
estimate of the relaxation time of Superb ox. From this result we can quantify, for 
a given grid resolution and computational time, the minimum number of particles 
that should be used to exclude any unwanted relaxation effects. It is again obvi- 
ous that calculations with a high number of grid-cells should be done only with a 
sufficiently large particle number to ensure T ve \ > T Hu bbic- 



5 Memory and CPU- Time Requirements 



5. 1 Memory 



The memory requirement of Superb ox scales both linearly with the particle num- 
ber, iVp, and with the number of grid-cells, iVg C = N 3 . For the particle array, 24 byte 
per particle (4 bytes per phase- space variable) are needed. The total amount of 
memory for different particle numbers and grid-resolutions is shown in Fig. 12. 




Fig. 12. The total memory requirement of Superbox for different grid-resolutions as func- 
tion of the particle-number N p (straight line is the memory requirement of the particle array 
alone). 
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The memory required for a Superbox calculation can be estimated from 

memory grid = (5 • TV 3 + N(2N) 2 + (N + l) 3 ) x 4 byte, (26) 
memoryparticics = (8 • N p + overhead) x 4 byte. (27) 

The first term on the right-hand side of Eq. 26 comes from the 5 grids (Section 2.3). 
The second term comes from the array needed for the FFT, and the last term comes 
from the array required to store the Fourier-transformed Green's function. The 
memory usage of the grids is independent of the number of galaxies, since they 
are treated consecutively in the same grid-arrays. The 'overhead' contains the ar- 
rays of the Superbox parameters (e.g. grid sizes), the centre of mass and density, 
and output data such as energies, angular momenta and Lagrangian radii, for each 
galaxy. Increasing the number of galaxies, while keeping the total particle and grid 
number constant, reduces the relative overhead contribution (Eq. 27) slightly due 
to these large arrays scaling only with the number of particles per galaxy. 



5.2 CPU-time 



The time-step cycle of Superbox is divided into three main routines (Fig. 1). 
Firstly, in Getrho the mass density of the grids is calculated. Secondly, the FFT 
routine computes the potential on the grids. Thirdly, the Pusher routine contains 
the force calculation, the position and velocity updating, and collection of the out- 
put data. These three routines need about 99 per cent of the total CPU time. Fig. 13 
shows the amount of CPU-time per step, £cpu> needed for the different routines. 
All data are derived on a Pentium II 400MHz processor with Linux as the operat- 
ing system and the GNU-g77 Fortran Compiler (with options —03 — m486) for a 
model single galaxy. For the Getrho and Pusher routines, t C pu oc iVp, while 
for the FFT routine, t CPU oc N gc log 10 N gc (N gc = N 3 ; Fig. 14), which is by far the 
dominant contribution. The resultant CPU time per step for galaxy % is: 



tcpu,i = aN p4 + [3x5 N gc \og w N gc . (28) 

As another example, for a Pentium 200MHz MMX, t C p\j m 20 sec/step for N p = 
10 6 particles in one galaxy and N gc = 64 3 grids. 

The CPU-time needed for a computation with iV ga i galaxies is derived by adding 
the individual times, 



gal 



t 



CPU.tot 



(29) 



23 



lOO 
50 



lO - 



prm 1 — i i 1 1 1 mi 1 — i i 1 1 mi 1 — i i 1 1 iiii 



5 - 



ex 

CD 



CD 1 

<D 0.5 
S 



0.1 
0.05 



T 

32— grid 




Q O 1 I ""I I I L 



I I 



a, 

a> 



o 

CD 



lOO 

50 



lO r 

5 - 



1 = 



co 0.5 
6 



0. 1 
0.05 




o o 1 1 ""^ 1 I 1 I i 

lO 4 lO 5 lO 8 lO 
number of particles 



lO 4 lO 5 lO 6 10 7 
number of particles 



Fig. 13. Contributions to the CPU time, tcPU> by different routines, and total CPU-time per 
step for different particle numbers and grid resolutions (r denotes the GETRHO-routine, f 
the FFT-routine and p the PuSHER-routine). 
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Fig. 14. CPU-time per step used by the FFT-routine for different grid-resolutions (solid 
line). The dashed line shows t oc N gc = N 3 . 
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6 Comparison with other Codes 



Demonstration of the conservation of energy and angular momentum is a neces- 
sary but not sufficient condition for validating any particle-mesh code. Additional 
validation of the code comes from an inter-comparison of results obtained with en- 
tirely different numerical techniques. Such a comparison is available in the study 
of the tidal dissolution of small satellite galaxies orbiting in an extended Galactic 
dark halo (Klessen & Kroupa 1998). Such an application of Superbox is rather 
extreme in that the jump in resolution from the middle grid (0.29 kpc/cell length) 
containing the small satellite to the outermost grid (25 kpc/cell length) containing 
the 'local universe' is very large. 

In that study, two Superb ox computations are compared with calculations done 
using a direct but softened N-body integrator on the special hardware device named 
Grape 3 connected with a SUN-Ultra 20. The Superbox calculation uses N gc = 
32 3 grids, has 3 x 10 5 satellite particles, and treats the live Galactic dark halo with 
10 6 particles. The Grape computation uses 1.3 x 10 5 satellite particles with the 
Galactic dark halo being an analytic isothermal potential. Both calculations proceed 
for many thousands of time steps, corresponding to a time interval of many Gyr, 
until beyond disruption of the satellite. The results are in nice agreement, yielding 
essentially the same behaviour of the satellite, apart from expected (small) dif- 
ferences in exact disruption time. Of interest is also a comparison in CPU times 
required for the calculations, keeping in mind the very different number of parti- 
cles used: with Superbox it takes 0.57 days on an IBM RISC/6000 workstation to 
compute 1 Gyr using 1.3 x 10 6 particles, and it takes 0.36 days with Grape using 
1.3 x 10 5 particles (with an 0(N%) scaling). 

In the above calculations, the satellite mass is too small for the orbit to be affected 
by dynamical friction. A comparison of the orbital decay through dynamical fric- 
tion of a massive satellite is of interest with analytical estimates, because this allows 
an altogether independent verification if Superbox treats collective behaviour on 
larger scales correctly. This is discussed in Section 7. 



7 Dynamical Friction 

An object moving through a homogeneous distribution of much lighter masses in- 
duces a trailing over-density. This over-density attracts the object which, as a result, 
is decelerated (Chandrasekhar 1943). 

The orbital decay of a satellite galaxy or globular cluster orbiting within a larger 
galaxy is an astrophysical process relevant, for example, to the rate at which a satel- 
lite population is depleted and to the built-up of galactic bulges. In the Local Group, 
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dynamical friction affects the orbit of the Magellanic Clouds, which is important 
for tracing their origin (see Westerlund 1997 and Kroupa & Bastian 1997 for sum- 
maries). It may also change the orbit of the Sagittarius dwarf spheroidal galaxy 
(Ibata & Lewis 1998; Gomez-Flechoso et al. 1999) and other nearby companions, 
depending on their masses. Understanding and proper application of dynamical 
friction is thus a very important issue. 

The question if Chandrasekhar's formula (Eq. 30 below) can be applied to orbits 
within finite and inhomogeneous density distributions has been the issue of a sig- 
nificant debate throughout the 1970's and the 1980's. The result is that for satel- 
lite masses that are relatively small compared to the large galaxy, the formula is 
a good approximation (Bontekoe & van Albada 1987; Zaritsky & White 1988; 
Velazquez & White 1999). Massive satellites induce non-local perturbations in the 
larger galaxy that are not taken account of in the derivation of Eq. 30, and in the col- 
lision of two galaxies of comparable mass analytical estimates become intractable, 
and the self-consistent numerical experiment must be resorted to (e.g. Madejski & 
Bien 1993). In order to do this we must, however, have reason to trust the numerical 
results. 

To demonstrate the reliability of our results, we compare the orbital evolution of 
a satellite galaxy orbiting in an extended dark halo computed with the fully self- 
consistent SuPERBOX-code, with the evolution resulting from the by now well- 
established Chandrasekhar approximation (Eq. 30). 

7.1 Calculations with Superb ox 

For the numerical experiment a compact spherical satellite galaxy is injected into 
an extended isothermal dark parent halo. The numerical rendition of both systems 
is described in detail in Kroupa (1997), and only a short description is given here. 

The parent halo is assumed to extend to R c = 250 kpc with a core radius 7 = 5 kpc 
and a total mass M ha i = 2.85 x 10 12 M , corresponding to a circular velocity 
V c = 220 km/s. A particle takes 588 Myr to cross the 33 per cent mass diameter. 
The inner, middle and outer grids are centred on the density maximum and have 
edge lengths of 50, 188 and 700 kpc, respectively. The model is allowed to relax 
to virial equilibrium, after which state it has a slightly more compact configuration 
with a circular velocity at a radius of 150 kpc of 245km/s, corresponding to a mass 
within that radius of about 2.1 x 10 12 M Q . Due to a mild radial orbit instability the 
halo is also slightly prolate. 

Plummer density distributions are taken for the satellites, each with a Plummer 
radius R p \ = 3 kpc and a cutoff radius of 15 kpc. The innermost and middle grids 
are centred on the satellite's density maximum and have edge lengths of 10 and 
40 kpc, respectively. The outer grid is the same as for the halo. Three satellite 
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masses are used, M sat = 1 x 10 10 , 7x 10 10 and 5 x 10 11 M . The respective crossing 
times of the 33 per cent mass diameter are t cri33 = 84, 32 and 12 Myr. 

For the calculation, N = 32 grid cells per dimension are used (only 28 being active, 
see Section 2.3), with 10 6 particles in the parent halo, and 3 x 10 5 satellite particles. 
For the halo the resolution is thus 1 .79 kpc/cell-length for r < 25 kpc, 6.7 1 kpc/cell- 
length for 25 kpc < r < 94 kpc and 25 kpc/cell-length for 94 kpc < r < 350 kpc. 
For the satellite the resolution is 0.36 kpc/cell-length for r s < 5 kpc, 1.43 kpc/cell- 
length for 5 kpc < r s < 20 kpc and 25 kpc/cell-length for 20 kpc < r s < 350 kpc, 
where r s is the distance from the satellite's density maximum. The integration step 
size is At = t CT:33 /75. 

The satellite is allowed to relax to virial equilibrium before being placed into the 
parent halo at an initial radius r(0) = \r(t = 0)| = 150 kpc with an initial velocity 
v(0) = \v(t = 0)| = 245 km/s perpendicular to the radius vector, r. 

The satellite's galactocentric distance is r(t) = |r sat (t) — r ha io(t)|, and its velocity 
is v(t) = |v sat (£) - Vhaio(0l> where r sat (t) and r halo (t) are the position vectors of 
the density maxima of the satellite and halo, respectively, and v sat (t) and v halo (t) 
are the velocity vectors of the satellite's and halo's density maxima, respectively. 



7.2 Chandrcisekhar friction 



The equation of motion of a satellite in a stationary and rigid isothermal parent halo 
is 



dt 2 



a g = 



V 2 

v c 

^y2 _|_ y*2 ' 



r ? = -4vrG 2 lnA 



p(r)M 



sat 



erf (x) 



2xe x 



(30) 
(3D 
(32) 



where a g is the acceleration in an isothermal halo, and rj is the deceleration due 
to dynamical friction, in which G = 4.499 x 10~ 3 pc 3 / (M Myr 2 ) is the gravita- 
tional constant, p(r) = V 2 / [AnG (■y 2 + r 2 )] is the halo density and erf(x) is the 
error function with x — v/V c . A derivation of Chandrasekhar's dynamical friction 
formula can be found in Binney & Tremaine (1987). 



The numerical value of the Coulomb logarithm, InA = b max /b m i n , is somewhat ill- 
defined. It is calculated by integrating particle deflections over impact parameters 
ranging from a minimum (b min ) to a maximum (6 max ) effective value. The mini- 
mum impact parameter is taken here to be b m - m = 2.7 kpc. The maximum impact 
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parameter, however, is something like the distance over which p(r) falls off signifi- 
cantly, and is thus less-well defined. An upper limit is b max ~ r(t = 0) = 150 kpc. 
It can also be estimated by assuming p(r + & max ) = 0.5 p(r). With r = 150 kpc, 
5 max = 62 kpc. If, on the other hand, p(r — & max ) = 2 p(r), then with r = 150 kpc, 
fo max = 44 kpc. Once the satellite orbit decays to r = 30 kpc, then 6 max = 13 kpc, 
while 6 max = 9 kpc. The Coulomb logarithm thus lies in the range InA ps 1 to 4, and 
the above argument suggests that it may be a monotonically decreasing function of 
declining r(t). 

Apart from the somewhat arbitrary value of the Coulomb logarithm, the analytic 
derivation of Eq. 32 assumes a Maxwellian velocity distribution in the halo, ne- 
glects the self gravity of the wake induced by the satellite's gravitational focusing, 
and the motion of the deflected halo particles is assumed to be governed only by 
the satellite - the gravitational field of the parent halo is neglected. Nevertheless, 
Eq. 32 has been shown to provide reliable results provided M sat /M halo < 0.2 and 
7 < r(t) < R c (Binney & Tremaine 1987 and references therein). The Superbox 
model described in Section 7.1 conforms with these restrictions. 

For the comparison with Superbox, V c = 245 km/s, 7 = 3 kpc, and M sat = 
1 x 10 10 , 7 x 10 10 and 5 x 10 11 M . Eq. 30 is rewritten to four coupled first-order 
differential equations, and integrated numerically using the Runga-Kutta method, 
with a sufficiently small step size to ensure stability of the solution. 



7.3 Results 



Since the Coulomb logarithm is ill defined, it is necessary to first calibrate the 
numerical solution to Eq. 30 using one Superbox calculation. Fig. 15 shows the 
decay of the orbit of the satellite with M sat = 7 x 10 10 M & according to Superbox 
and Eq. 30 under different assumptions for InA. 

For the solution shown as the lower-dotted-r(t) curve in Fig. 15, InA = 4. The 
Superbox result requires a smaller Coulomb logarithm. Tests show that InA = 1.6 
leads to good agreement. To show a possible upper limit to the solution of the 
equation of motion, the case InA = (0.049 r(t)/b min ) is plotted as the upper-dotted- 
r(t) curve. For this case, InA = 1 initially, and it decreases with r(t). The figure 
also shows the time-varying distance along the x-axis between the density maxima 
of the satellite and the halo. The damped oscillation is nicely evident, and the period 
of the orbit is initially 3.2 Gyr. The lower panel of Fig. 15 plots the solutions for 
v (t). The satellite in the Superbox calculation retains an approximately constant 
velocity, as is expected from the theory of dynamical friction, the solutions from 
which are plotted with different lines corresponding to the cases discussed above. 

The overall conclusion is that the Superbox result is consistent with theoretical 
expectations, provided InA =1.6. That this finding extends to other satellite masses 
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is shown in Fig. 16. In all cases, the satellites have lost less than 20 per cent of their 
mass by the end of the calculation. 




Fig. 15. The time evolution of the satellite's galactocentric distance, r (upper panel), 
and velocity, v (lower panel). The satellite has a mass M sat = 7 x 10 10 M and or- 
bits in an extended isothermal dark halo. The Superb ox result is shown as the thick 
solid curve. The other (theoretical) curves result from numerically integrating Eq. 30 as- 
suming In A = 4 (lower dotted curve in top panel), In A = 1.6 (long-dashed curve) and 
lnA(t) = ln(0.049 r(t)/b m - m ) with 6 m i n = 2.7 kpc (upper dotted curve in top panel). The 
SUPERBOX evolution of x(t) = x sat (t) — Xh a io(0 is plotted as the thin dotted line in the 
upper panel. 
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Fig. 16. The time evolution of the satellite's galactocentric distance, r (upper panel), and 
velocity, v (lower panel). The SUPERBOX satellite has a mass M sat = 5 x 10 11 , M Q (solid 
curve), 7 x 10 10 M (dash-dotted curve) and 1 x 10 10 M (dashed curve), and orbits in 
an extended isothermal dark halo. The case discussed in Fig. 15 is identical to the present 
SUPERBOX case plotted as the dash-dotted curve. Eq. 30, with InA = 1.6, produces the 
corresponding solutions shown as thin lines. 

It is notable that one single value for the Coulomb logarithm applies to the whole 
radial range as well as to all three satellites, suggesting that non-local perturbations 
of the extended halo do not significantly alter the orbit of the satellites. This finding 
is in nice agreement with the results for InA by Bontekoe & van Albada (1987) and 
Velazquez & White (1999), and support their finding that Chandrasekhar's approx- 
imation yields a useful description of orbital evolution if InA 1.5 is used. 
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Further calculations with more grid cells and a more detailed study will be needed 
to show if the oscillations in r(t) and v(t) seen in Figs. 15 and 16, as well as in 
similar figures in the literature (e.g. fig. 3 in Bontekoe & van Albada 1987) are 
physical, i.e. if they are manifestations of the dependency of rj (Eq. 32) on r and v. 



8 Conclusions 

Superbox is an unconventional particle-mesh code that uses moving sub-grids 
to track and resolve high-density peaks in the particle distribution. This extension 
avoids the limitation in spatial resolution encountered with standard particle mesh 
codes that employ only a single grid. The code is efficient in that the computational 
overhead is kept as slim as possible. The code is also memory efficient by using 
only one set of grids to treat galaxies in succession. 

In this paper we have explained the algorithms used for force evaluation, grid posi- 
tioning and orbit integration. The nearest-grid-point scheme (NGP) is used to com- 
pute the density on a grid. The potential is obtained by FFT, from which accelera- 
tions at the position of each particle within a cell are calculated using second-order 
central-difference quotients with linear interpolation to the position of particles in- 
side that cell. Such a scheme has been identified as a special higher-order NGP 
scheme and is competitive with the cloud-in-cell (CIC) scheme regarding the pre- 
cision and continuity of the forces across grid boundaries. Together with the nested 
sub-grids, this leads to precise and reliable force calculations at the location of each 
particle. The sub-grids are positioned at each time-step on the density maximum or 
at the centre of mass of a galaxy, the former method being the more useful. For 
orbit integration the leap-frog scheme is used. Conservation of energy and angu- 
lar momentum is excellent, and numerical relaxation is significantly longer than a 
Hubble time for a galaxy with a crossing time of about 10 8 yr and particle numbers 
exceeding 10 5 . CPU-timing and memory consumption are such that calculations 
of galaxy encounters involving many x 10 6 particles are possible on present-day 
desk-top computers with 20MB RAM or more (see Fig. 12). On the other hand, it 
is clear that using the resources of massively parallel supercomputers will signifi- 
cantly increase the capabilities of Superbox. 

The most serious limitation of Superbox lies in the inability to resolve newly 
formed self-gravitating systems, because the grid structure is decided on at the 
beginning of a computation. Superbox has presently no scheme to allow the in- 
clusion of additional grids during a calculation. Such objects can form in tidal arms 
and may be the precursors of some dwarf satellite galaxies. This limitation will be 
reduced as the number of grids that can be used increases owing to advances in 
computer technology. 

Current and future developments of Superbox would include for example incor- 
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porating an individual time- step scheme per grid and galaxy. We intend to par- 
allelise the code to run on CRAY T3E super-computers. Furthermore, additional 
grid-levels will be introduced to reduce the jump in resolution between the present 
middle and outer grids, and/or to further increase the resolution of the central re- 
gions of a galaxy. Finally, the current sticky -particle algorithm of Superb ox can 
be extented (e.g. to include star formation). 

Problems that are now being tackled with Superb ox include computations of the 
tidal interactions of satellite and disk galaxies (J.M. Penarrubia, in preparation), and 
the possible formation of spheroidal satellite galaxies from stellar super clusters 
(Fellhauer et al.). Fellhauer & Kroupa (2000) report on a study of the dynamical 
evolution of clusters of dozens to hundreds of young massive and compact star 
clusters (i.e. stellar super clusters) in a tidal field, in an attempt to understand the 
evolution of such objects observed in HST images of the Antennae galaxies (see 
e.g. Lancon et al. 2000). Also, the parameter survey of dwarf galaxies orbiting 
in a massive dark halo is being completed (Kroupa, in preparation), to identify 
regions in parameter space that may lead to dSph-like systems without dark matter. 
Furthermore, collapse calculations are being performed to study the properties of 
the resulting object after violent relaxation and secular relaxation thereafter (Boily, 
in preparation). 
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